309. Best Time to Buy and Sell Stock with Cooldown
Medium

You are given an array prices where prices[i] is the price of a given stock on the ith day.

Find the maximum profit you can achieve. You may complete as many transactions as you like (i.e., buy one and sell one share of the stock multiple times) with the following restrictions:

  • After you sell your stock, you cannot buy stock on the next day (i.e., cooldown one day).

Note: You may not engage in multiple transactions simultaneously (i.e., you must sell the stock before you buy again).

 

Example 1:

Input: prices = [1,2,3,0,2]
Output: 3
Explanation: transactions = [buy, sell, cooldown, buy, sell]

Example 2:

Input: prices = [1]
Output: 0

 

Constraints:

  • 1 <= prices.length <= 5000
  • 0 <= prices[i] <= 1000
Accepted
198,862
Submissions
407,281
Seen this question in a real interview before?
Companies
Related Topics

Average Rating: 4.89 (90 votes)

Premium

Solution


Overview

First of all, we would like to mention that this is yet another problem from the series of Best-Time-to-Buy-and-Sell-Stock problems, which we list as follows:

One could try to resolve them one by one, which certainly could help with this problem.

There have been quite some excellent posts in the Discussion forum. We would like to mention that the user fun4LeetCode even developed a mathematical representation that is able to be generalized to each of the problems.

That being said, here we contribute some approaches, which hopefully could provide you different perspectives for the problem.

As one might have seen the hint from the problem description, which says "dynamic programming" (i.e. DP), we could tackle this problem mainly with the technique called dynamic programming.

Often the case, in order to come up with a dynamic programming solution, it would be beneficial to draw down some mathematical formulas to model the problem.

As a reminder, the nature of dynamic programming is to break the original problem into several subproblems, and then reuse the results of subproblems for the original problem.

Therefore, due to the nature of DP, the mathematical formulas that we should come up with would almost certainly assume the form of recursion.

Before embarking on the next sections of this article, we kindly ask the audiences to keep an open mind, fasten your seat belts and enjoy the ride with a heavy (yet healthy) dose of mathematical formulas.


Approach 1: Dynamic Programming with State Machine

Intuition

First of all, let us take a different perspective to look at the problem, unlike the other algorithmic problems.

Here, we will treat the problem as a game, and the trader as an agent in the game. The agent can take actions that lead to gain or lose of game points (i.e. profits). And the goal of the game for the agent is to gain the maximal points.

In addition, we will introduce a tool called state machine, which is a mathematical model of computation. Later one will see how the state machine coupled with the dynamic programming technique can help us solve the problem easily.

In the following sections, we will first define a state machine that is used to model the behaviors and states of the game agent.

Then, we will demonstrate how to apply the state machine to solve the problem.

Definition

Let us define a state machine to model our agent. The state machine consists of three states, which we define as follows:

  • state held: in this state, the agent holds a stock that it bought at some point before.

  • state sold: in this state, the agent has just sold a stock right before entering this state. And the agent holds no stock at hand.

  • state reset: first of all, one can consider this state as the starting point, where the agent holds no stock and did not sell a stock before. More importantly, it is also the transient state before the held and sold. Due to the cooldown rule, after the sold state, the agent can not immediately acquire any stock, but is forced into the reset state. One can consider this state as a "reset" button for the cycles of buy and sell transactions.

At any moment, the agent can only be in one state. The agent would transition to another state by performing some actions, namely:

  • action sell: the agent sells a stock at the current moment. After this action, the agent would transition to the sold state.

  • action buy: the agent acquires a stock at the current moment. After this action, the agent would transition to the held state.

  • action rest: this is the action that the agent does no transaction, neither buy or sell. For instance, while holding a stock at the held state, the agent might simply do nothing, and at the next moment the agent would remain in the held state.

Now, we can assemble the above states and actions into a state machine, which we show in the following graph where each node represents a state, and each edge represents a transition between two states. On top of each edge, we indicate the action that triggers the transition.

state machine

Notice that, in all states except the sold state, by doing nothing, we would remain in the same state, which is why there is a self-looped transition on these states.

Deduction

Now, one might wonder how exactly the state machine that we defined can help to solve the problem.

As we mentioned before, we model the problem as a game, and the trader as an agent in the game. And this is where our state machine comes into the picture. The behaviors and the states of the game agent can be modeled by our state machine.

mario game

Given a list stock prices (i.e. price[0...n]), our agent would walk through each price point one by one. At each point, the agent would be in one of three states (i.e. held, sold and reset) that we defined before. And at each point, the agent would take one of the three actions (i.e. buy, sell and rest), which then would lead to the next state at the next price point.

Now if we chain up each state at each price point, it would form a graph where each path that starts from the initial price point and ends at the last price point represents a combination of transactions that the agent could perform through out the game.

graph of state transition

The above graph shows all possible paths that our game agent agent walks through the list, which corresponds to all possible combinations of transactions that the trader can perform with the given price sequence.

In order to solve the problem, the goal is to find such a path in the above graph that maximizes the profits.

In each node of graph, we also indicate the maximal profits that the agent has gained so far in each state of each step. And we highlight the path that generates the maximal profits. Don't worry about them for the moment. We will explain in detail how to calculate in the next section.

Algorithm

In order to implement the above state machine, we could define three arrays (i.e. held[i], sold[i] and reset[i]) which correspond to the three states that we defined before.

Each element in each array represents the maximal profits that we could gain at the specific price point i with the specific state. For instance, the element sold[2] represents the maximal profits we gain if we sell the stock at the price point price[2].

According to the state machine we defined before, we can then deduce the formulas to calculate the values for the state arrays, as follows:

sold[i]=hold[i1]+price[i]held[i]=max(held[i1],reset[i1]price[i])reset[i]=max(reset[i1],sold[i1]) \text{sold}[i] = \text{hold}[i-1] + \text{price}[i] \\ \text{held}[i] = \max{(\text{held}[i-1], \quad \text{reset}[i-1] - \text{price}[i])} \\ \text{reset}[i] = \max{(\text{reset}[i-1], \quad \text{sold}[i-1])}

Here is how we interpret each formulas:

  • sold[i]\text{sold}[i]: the previous state of sold can only be held. Therefore, the maximal profits of this state is the maximal profits of the previous state plus the revenue by selling the stock at the current price.

  • held[i]\text{held}[i]: the previous state of held could also be held, i.e. one does no transaction. Or its previous state could be reset, from which state, one can acquire a stock at the current price point.

  • reset[i]\text{reset}[i]: the previous state of reset could either be reset or sold. Both transitions do not involve any transaction with the stock.

Finally, the maximal profits that we can gain from this game would be max(sold[n],reset[n])\max{(\text{sold}[n], \text{reset}[n])}, i.e. at the last price point, either we sell the stock or we simply do no transaction, to have the maximal profits. It makes no sense to acquire the stock at the last price point, which only leads to the reduction of profits.

In particular, as a base case, the game should be kicked off from the state reset, since initially we don't hold any stock and we don't have any stock to sell neither. Therefore, we assign the initial values of sold[-1] and held[-1] to be Integer.MIN_VALUE, which are intended to render the paths that start from these two states impossible.

As one might notice in the above formulas, in order to calculate the value for each array, we reuse the intermediate values, and this is where the paradigm of dynamic programming comes into play.

More specifically, we only need the intermediate values at exactly one step before the current step. As a result, rather than keeping all the values in the three arrays, we could use a sliding window of size 1 to calculate the value for max(sold[n],reset[n])\max{(\text{sold}[n], \text{reset}[n])}.

In the following animation, we demonstrate the process on how the three arrays are calculated step by step.

Current
1 / 7

As a byproduct of this algorithm, not only would we obtain the maximal profits at the end, but also we could recover each action that we should perform along the path, although this is not required by the problem.

In the above graph, by starting from the final state, and walking backward following the path, we could obtain a sequence of actions that leads to the maximal profits at the end, i.e. [buy, sell, cooldown, buy, sell].

Complexity Analysis

  • Time Complexity: O(N)\mathcal{O}(N) where NN is the length of the input price list.

    • We have one loop over the input list, and the operation within one iteration takes constant time.
  • Space Complexity: O(1)\mathcal{O}(1), constant memory is used regardless the size of the input.


Approach 2: Yet-Another Dynamic Programming

Intuition

Most of the times, there are more than one approaches to decompose the problem, so that we could apply the technique of dynamic programming.

Here we would like to propose a different perspective on how to model the problem purely with mathematical formulas.

Again, this would be a journey loaded with mathematical notations, which might be complicated, but it showcases how the mathematics could help one with the dynamic programming (pun intended).

Definition

For a sequence of prices, denoted as price[0,1,...,n]\text{price}[0, 1, ..., n], let us first define our target function called MP(i)\text{MP}(i). The function MP(i)\text{MP}(i) gives the maximal profits that we can gain for the price subsequence starting from the index ii, i.e. price[i,i+1,...,n]\text{price}[i, i+1, ..., n].

Given the definition of the MP(i)\text{MP}(i) function, one can see that when i=0i=0 the output of the function, i.e. MP(0)\text{MP}(0), is exactly the result that we need to solve the problem, which is the maximal profits that one can gain for the price subsequence of price[0,1,...,n]\text{price}[0, 1, ..., n].

Suppose that we know all the values for MP(i)\text{MP}(i) onwards until MP(n)\text{MP}(n), i.e. we know the maximal profits that we can gain for any subsequence of price[k...n]k[i,n]\text{price}[k...n] \quad k \in [i, n].

Now, let us add a new price point price[i1]\text{price}[i-1] into the subsequence price[i...n]\text{price}[i...n], all we need to do is to deduce the value for the unknown MP(i1)\text{MP}(i-1).

Up to this point, we have just modeled the problem with our target function MP(i)\text{MP}(i), along with a series of definitions. The problem now is boiled down to deducing the formula for MP(i1)\text{MP}(i-1).

In the following section, we will demonstrate how to deduce the formula for MP(i1)\text{MP}(i-1).

Deduction

With the newly-added price point price[i1]\text{price}[i-1], we need to consider all possible transactions that we can do to the stock at this price point, which can be broken down into two cases:

  • Case 1): we buy this stock with price[i1]\text{price}[i-1] and then sell it at some point in the following price sequence of price[i...n]\text{price}[i...n]. Note that, once we sell the stock at a certain point, we need to cool down for a day, then we can reengage with further transactions. Suppose that we sell the stock right after we bought it, at the next price point price[i]\text{price}[i], the maximal profits we would gain from this choice would be the profit of this transaction (i.e. price[i]price[i1]\text{price}[i] - \text{price}[i-1]) plus the maximal profits from the rest of the price sequence, as we show in the following:

example of profit calculation

In addition, we need to enumerate all possible points to sell this stock, and take the maximum among them. The maximal profits that we could gain from this case can be represented by the following:

C1=max{k[i,n]}(price[k]p[i1]+MP(k+2)) C_1 = \max_{\{k \in [i, n]\}}\big( \text{price}[k] - \text{p}[i-1] + \text{MP}(k+2) \big)

  • Case 2): we simply do nothing with this stock. Then the maximal profits that we can gain from this case would be MP(i)\text{MP}(i), which are also the maximal profits that we can gain from the rest of the price sequence.

C2=MP(i) C_2 = \text{MP}(i)

By combining the above two cases, i.e. selecting the max value among them, we can obtain the value for MP(i1)\text{MP}(i-1), as follows:

MP(i1)=max(C1,C2) \text{MP}(i-1) = \max(C_1, C_2)

MP(i1)=max(max{k[i,n]}(price[k]price[i1]+MP(k+2)),MP(i)) \text{MP}(i-1) = \max\Big(\max_{\{k \in [i, n]\}}\big( \text{price}[k] - \text{price}[i-1] + \text{MP}(k+2) \big), \quad \text{MP}(i) \Big)

By the way, the base case for our recursive function MP(i)\text{MP}(i) would be MP(n)\text{MP}(n) which is the maximal profits that we can gain from the sequence with a single price point price[n]\text{price}[n]. And the best thing we should do with a single price point is to do no transaction, hence we would neither lose money nor gain any profit, i.e. MP(n)=0\text{MP}(n) = 0.

The above formulas do model the problem soundly. In addition, one should be able to translate them directly into code.

Algorithm

With the final formula we derived for our target function MP(i)\text{MP}(i), we can now go ahead and translate it into any programming language.

  • Since the formula deals with subsequences of price that start from the last price point, we then could do an iteration over the price list in the reversed order.

  • We define an array MP[i] to hold the values for our target function MP(i)\text{MP}(i). We initialize the array with zeros, which correspond to the base case where the minimal profits that we can gain is zero. Note that, here we did a trick to pad the array with two additional elements, which is intended to simplify the branching conditions, as one will see later.

  • To calculate the value for each element MP[i], we need to look into two cases as we discussed in the previous section, namely:

    • Case 1). we buy the stock at the price point price[i], then we sell it at a later point. As one might notice, the initial padding on the MP[i] array saves us from getting out of boundary in the array.

    • Case 2). we do no transaction with the stock at the price point price[i].

  • At the end of each iteration, we then pick the largest value from the above two cases as the final value for MP[i].

  • At the end of the loop, the MP[i] array will be populated. We then return the value of MP[0], which is the desired solution for the problem.

Complexity Analysis

  • Time Complexity: O(N2)\mathcal{O}(N^2) where NN is the length of the price list.

    • As one can see, we have nested loops over the price list. The number of iterations in the outer loop is NN. The number of iterations in the inner loop varies from 11 to NN. Therefore, the total number of iterations that we perform is i=1Ni=N(N+1)2 \sum_{i=1}^{N} i = \frac{N\cdot(N+1)}{2}.

    • As a result, the overall time complexity of the algorithm is O(N2)\mathcal{O}(N^2).

  • Space Complexity: O(N)\mathcal{O}(N) where NN is the length of the price list.

    • We allocated an array to hold all the values for our target function MP(i)\text{MP}(i).

Report Article Issue

Comments: 26
zsezdl's avatar
Read More

This is a hard question for me.

138
Show 2 replies
Reply
Share
Report
mgibbs's avatar
Read More

This should be marked as hard. I don't think I would want to work for a company that asks a question like this in an interview. Absolutely insane difficulty for an interview question.

54
Show 2 replies
Reply
Share
Report
Frankxiao08's avatar
Read More

The explanation is quite clear! Great Job! At first, I don't want to read it because it's so long, but once I start to read, it takes only about 5 minutes for me finish solution 1. 3 stages is brilliant!

18
Reply
Share
Report
nostradamus29's avatar
Read More

Excellent explanation reg the states. But how am I supposed to come with that during an interview?

8
Show 2 replies
Reply
Share
Report
alexbbb's avatar
Read More

Explanation to Approach 1 (especially state machine and that table/graph-ish diagram) is very good.

5
Reply
Share
Report
AvidReader's avatar
Read More

If anyone knows few other problems, which can also be modeled and solved with help of state machine, please let me know. I really loved the "state machine" perspective given in this problem.

5
Show 1 reply
Reply
Share
Report
Hoffles's avatar
Read More

Excellent state machine explanation. Perhaps it would be helpful if the first qn in the series (Best Time to Buy and Sell Stocks) could have this type of explanation in the solutions. So when people progress through this qn series they can build on their state machine knowledge.

4
Show 1 reply
Reply
Share
Report
zyg3328's avatar
Read More

what if the cooldown time is two days? How should we solve this issue in State Machine?

2
Show 1 reply
Reply
Share
Report
yaroslav-repeta's avatar
Read More

I think, the second solution can be simplified to O(N).
If prices[i] >= prices[i + 1], there is no way MP[i] > MP[i + 1]
If prices[i] < pprices[i] < prices[i + 1], then there are two possible ways MP[i] > MP[i + 1]:

  1. prices scope increase, so we can do more operations than previously, so buy by prices[i], sell by prices[i + 1], + MP[i + 3] (max after cooldown)
  2. If for MP[i + 1], the first buy was with using price[z] and if prices[i] < price[z], then buy by price[i] instead of buying by prices[z]
z = L - 1
for i in range(L-1, -1, -1):
  MP[i] = MP[i + 1]
  if i < L - 1 and prices[i] < prices[i + 1]:
    MP[i] = max(MP[z] + prices[z] - prices[i], prices[i + 1] - prices[i] + MP[i + 3], MP[i])
    if MP[i] > MP[i + 1]:
      z = i

And to O(1) space, if keep only last three MP values and MP[z]

MP_z = 0
MP = deque([0] * 3)

z = L - 1
for i in range(L-1, -1, -1):
  MP.appendleft(MP[0])
  if i < L - 1 and prices[i] < prices[i + 1]:
    MP[0] = max(MP_z + prices[z] - prices[i], prices[i + 1] - prices[i] + MP[3], MP[0])
    if MP[0] > MP[1]:
      z = i
      MP_z = MP[0]
  MP.pop()
return MP[0]

2
Show 1 reply
Reply
Share
Report
sasawq21's avatar
Read More

Can someone explain why held[0] and sold[0] are INT_MIN and reset[0] is 0 ?

3
Show 2 replies
Reply
Share
Report
Time Submitted
Status
Runtime
Memory
Language
07/29/2020 13:05Accepted8 ms11.5 MBcpp
/23
Contribute
|||
Saved
Trie
This list is empty.